ABSTRACT

Higher education institutions want not only to provide quality education to
their students but also to advise them on career options according to their
predicted performance. Satisfactory student performance plays an important
role in producing high-quality graduates who will become competent workers
for the country's economic and social development [2]. Students' performance,
such as who will pass and who is likely to fail, can be predicted from the
many features available. Students want to know their likely final performance
before their results are announced and before they sit their semester exams.
Based on the predicted performance, students can plan properly and improve
their skills in order to perform well in their end examination. To give sound
advice to such students, an educational data mining system is implemented to
predict students' final performance from factors that include IM, PSM, Basics,
ACIC, ASS, CP, ATT, ACOC and ESM. In this research, an attempt has been made
to explore naive Bayesian classification to predict students' future
performance.

KEYWORDS: Educational Data Mining, Naive Bayesian Classification, Prior
Probability

1. INTRODUCTION

In higher learning institutions, students' performance is an
essential concern, because one of the criteria for a
high-quality university is its record of academic
achievement [14].

Nowadays, when knowledge and quality are critical factors in
the global economy, Higher Education Institutes (HEIs), as
knowledge centers and human resource developers, play a vital
role. Thus, it is important to ensure the quality of the
educational processes and to classify the performance of
students [16]. Satisfactory student performance plays an
important role in producing high-quality graduates who will
become competent workers for the country's economic and social
development. Recruiting competent workers, especially fresh
graduates, is one of the main aspects considered by companies [2].
Thus, students have to make the greatest effort in their
studies to achieve a good performance in order to meet
companies' demands.

At present, many techniques are being proposed to
evaluate students' performance, and data mining is one of the
most popular. Data mining uses a combination of a vast
knowledge base, advanced analytical skills and domain
knowledge to detect hidden trends and patterns, and it can be
used in almost any sector, from business to medicine to
education. Educational institutes can likewise apply data
mining to extract valuable information from their databases;
this is known as Educational Data Mining (EDM) [8]. It aims at
devising and using algorithms to predict students' future
performance and to improve educational results for further
decision making [6].


Data mining tasks can be divided into two categories:
descriptive and predictive. Descriptive mining tasks
characterize properties of the data in a target data set.
Predictive mining tasks perform induction on the current
data in order to make predictions [10]. Classification and
prediction are predictive mining tasks, whereas clustering,
association rule mining and summarization are descriptive
mining tasks.

Data mining techniques can contribute to predicting
students' future performance. The main goal of an education
system is to provide quality education to students.
Providing a high quality of education depends on identifying
unmotivated students before they enter the final
examination. Data mining can be very suitable for the
improvement and development of an education system [7].
The present paper mines the educational domain
using Bayesian classification to predict the future
performance of engineering students of the Department of IT,
Lakireddy Bali Reddy College of Engineering, Mylavaram, from
2012 to 2016 [19].

There is no absolute scale for measuring knowledge, but the
examination score is one scale that offers an indicator of
students' performance. Students' academic achievement is
measured by the internal assessment and the end semester
examination. The internal assessment is carried out by the
teacher based on students' performance in educational
activities such as basic knowledge of the subject, ability
to concentrate in class, assignments, content perception,
attendance and awareness of course outcomes.

@ IJTSRD | Unique Paper ID - IJTSRD26642 | Volume - 3 | Issue - 5 | July - August 2019

International Journal of Trend in Scientific Research and Development (IJTSRD) @ www.ijtsrd.com eISSN: 2456-6470

Data mining provides many tasks that can be applied to
study students' performance. In this research, the
classification task is used to evaluate students' performance
in the final semester. Student information such as previous
semester marks and marks for educational activities was
collected from the student database system in order to predict
performance in the end semester examination [19].

2. Related Work 

Various data mining techniques have been used to analyse the
academic performance of students at various levels. The
following are a few of those used specifically for academic
progression.

Jayaprakash and Balamurugan [12] presented a system in
which the naive Bayes algorithm is applied to predict
students' academic performance in end-of-semester
examinations by analysing student feedback and their
performance in mid-semester exams. This system enables
educational institutions to identify the weaker students in
advance and arrange the necessary training before they sit
their final exams.

Ayesha, Mustafa, Sattar and M. Inayat Khan [23] used the Bayesian
classification method as a data mining technique and
suggested that students' grades in the senior secondary exam,
living location, medium of teaching, mother's qualification,
students' other habits, family annual income and students'
family status were highly correlated with student
academic performance.

Al-Radaideh, Q., Al-Shawakfa, E. and Al-Najjar, M. [1]
proposed classification as a data mining technique to
evaluate students' performance, applying the decision tree
method for classification. This study helps in the early
identification of dropouts and of students who need special
attention, and allows the teacher to provide appropriate
advising.

M. Wook, Y. Hani Yamaya, N. Wahab, M. Rizal Mohd Isa, N.
Fatimah Awang and H. Yann Seong [15] compared two data
mining techniques, an Artificial Neural Network and
a combination of clustering and decision tree classification,
for predicting and classifying students' academic
performance. The technique that provided the more
accurate prediction and classification was selected as the
best model. Using this model, the pattern that influences
students' academic performance was identified.

S. Kumar Yadav, B. Bharadwaj and S. Pal [22] collected
university student data such as attendance, class test,
seminar and assignment marks from the students' database.
They used three algorithms, ID3, C4.5 and CART, to predict
performance at the end of the semester and concluded
that CART was the best algorithm for classifying the data.

N. Thai Nghe, P. Janecek and P. Haddawy compared
the accuracy of decision tree and Bayesian network
algorithms for predicting the academic performance of
undergraduate and postgraduate students at two very
different academic institutes. These predictions are most
useful for identifying and assisting failing students and
for better determining scholarships. According to the results,
the decision tree classifier provides better accuracy than
the Bayesian network classifier [18].

Shaziya, Zaheer, and Kavitha [20] introduced an approach to
predict the performance of students in per-semester exams
using a naive Bayes classifier. The main goal is to know the
grades that students may obtain in their end-of-semester
results. This approach helps the educational institution,
teachers, and students, i.e., all the stakeholders who take
part in an educational system, and they can profit from the
prediction of students' results in a multitude of ways.
Students and teachers can take the required actions to improve
the results of those students whose predicted result is not
satisfactory.

Bharadwaj and Pal reviewed university students' data
such as attendance, class test, seminar and assignment marks
from the students' previous database to predict
performance at the end of the semester [5].

The proposed system uses a training data set of engineering
students to build the naive Bayes model. The model then
predicts the end-semester results of students when applied to
the test data. In this approach, a number of attributes are
selected to predict the final grade of a student.

3. Data mining definition and techniques 

Data mining, also termed Knowledge Discovery in
Databases (KDD), refers to extracting or "mining" knowledge
from large amounts of data [13]. Fig. 1 presents how to extract
a well-defined pattern as a result of mining the data.



Fig. 1 - Conversion of data into a pattern 


The knowledge discovery process comprises various steps, such
as data cleaning, transformation, data mining and pattern
evaluation, in extracting knowledge from data. Knowledge
discovery is associated with a multitude of tasks such as
association, clustering, classification and prediction.
Classification and prediction are functions used
to create models that are built by analyzing data and
then used for assessing other data. Classification techniques
can be used on educational data for predicting
students' behavior, performance in examinations, etc. Basic
techniques for classification are decision tree induction,
Bayesian classification and neural networks. A number of
well-known data mining classification algorithms exist, such as
ID3, REPTree, SimpleCart, J48, NBTree, BFTree, Decision
Table, MLP and BayesNet [11].

Schools and universities apply data mining as a powerful
new technology with great potential to focus on the most
important information in the data they have collected about
the behavior of their students and potential learners [9].
Data mining involves the use of data analysis tools to
discover previously unknown patterns and relationships in
large data sets. These tools consist of statistical models,
mathematical algorithms and machine learning methods.
These techniques are able to identify information within the
data that queries and reports cannot effectively reveal [6].

3.1 Data preparation 

The data set used in this study was collected from the
Information Technology department of Lakireddy Bali Reddy
College of Engineering, Mylavaram, for the sessions 2012 to
2016.

The experiment was carried out on a data set of 28
records using only nine high-impact attributes: Internal Marks
(IM), Previous Semester Marks (PSM), Basic knowledge of
the subject (Basics), Ability to concentrate in the class (ACIC),
Assignment (ASS), Content Perception (CP), Tutorial (TUT),
Attendance (ATT) and Course Outcome Awareness (ACO). The
tenth attribute, "End Semester Marks (ESM)", is the unknown
class attribute that is to be predicted by the algorithm [19].

3.2 Data selection and transformation 

In this step, only those fields required for data mining
were chosen. The values defined for each variable
in the present analysis are
described in Table 1 for reference.


Table 1. Simple record of students' data set

| Variable | Description | Possible Values |
|---|---|---|
| IM | Internal Marks | {A >= 60%, B >= 45% and < 60%, C >= 36% and < 45%, Fail < 36%} |
| PSM | Previous Semester Marks | {Fail < 36%, Average >= 36% and < 45%, Good >= 45% and < 60%, Excellent >= 60%} |
| Basics | Basics in the subject | {Weak, Average, Strong} |
| ACIC | Ability to Concentrate in the Class | {Weak, Average, Strong} |
| ASS | Assignment | {Yes: student completed the assignment work assigned by the teacher, No: student did not complete the assignment work assigned by the teacher} |
| CP | Content Perception | {Weak, Average, Strong} |
| ATT | Attendance | {A, B, C} |
| ACO (Awareness on CO's) | Course Outcomes Awareness | {Yes, No} |
| TUT | Tutorial | {Yes, No} |
| ESM | End Semester Marks | {Fail < 36%, Average >= 36% and < 45%, Good >= 45% and < 60%, Excellent >= 60%} |
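The transformation of raw percentage marks into the categories of Table 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `grade_band` is hypothetical, the thresholds are the ones listed for PSM and ESM in Table 1, and the paper does not say how the authors actually performed the conversion.

```python
def grade_band(percentage: float) -> str:
    """Map a percentage mark to the PSM/ESM categories listed in Table 1."""
    if not 0 <= percentage <= 100:
        raise ValueError("percentage must lie between 0 and 100")
    if percentage < 36:
        return "Fail"        # below 36%
    if percentage < 45:
        return "Average"     # 36% up to, but not including, 45%
    if percentage < 60:
        return "Good"        # 45% up to, but not including, 60%
    return "Excellent"       # 60% and above


# Boundary values fall into the higher band, as the ">=" signs in Table 1 imply.
for mark in (30, 36, 44.5, 45, 59.9, 60):
    print(mark, grade_band(mark))
```

Discretizing the marks in this way is what allows the classifier to work purely with counts of categorical values.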


3.3 Naive Bayes Classifier Algorithm 

Naive Bayes is a statistical classifier that can be applied to
predict the probability of membership in a class. The naive
Bayes classifier has classification capabilities similar to
those of decision trees and neural networks, and it has been
shown to offer high accuracy and speed when applied to
databases with large amounts of data [3]. Naive Bayes is based
on the simplifying assumption that attribute values are
conditionally independent given the output value. In other
words, given the output value, the probability of observing
the attribute values jointly is the product of their
individual probabilities [3]. An advantage of naive Bayes is
that it requires only a small amount of training data to
estimate the parameters needed for classification, and it
often works much better than expected in complex real-world
situations. Bayes' theorem states:


$$P(H/X) = \frac{P(X/H)\,P(H)}{P(X)} \qquad (1)$$

where X is data of an unknown class, H is the hypothesis that
X belongs to a specific class, P(H/X) is the probability of
hypothesis H given X, P(H) is the prior probability of
hypothesis H, P(X/H) is the probability of X
given that H holds, and P(X) is the probability of X [3].
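As a small illustration of equation (1), the following Python sketch computes P(H/X) for every hypothesis in a set of mutually exclusive and exhaustive hypotheses, obtaining P(X) by summing P(X/H)P(H) over them. The function name `posterior` and the numbers in the example are hypothetical and were chosen only to make the arithmetic easy to follow.

```python
def posterior(priors: dict[str, float], likelihoods: dict[str, float]) -> dict[str, float]:
    """Apply Bayes' theorem, P(H/X) = P(X/H) P(H) / P(X), to each hypothesis H."""
    # P(X) is the sum of P(X/H) P(H) over all hypotheses (law of total probability).
    evidence = sum(likelihoods[h] * priors[h] for h in priors)
    if evidence == 0:
        raise ValueError("P(X) is zero, so the posterior is undefined")
    return {h: likelihoods[h] * priors[h] / evidence for h in priors}


# Hypothetical priors P(H) and likelihoods P(X/H), for illustration only.
result = posterior({"Pass": 0.5, "Fail": 0.5}, {"Pass": 0.2, "Fail": 0.6})
print(result)
```

With equal priors, the hypothesis with the larger likelihood receives the larger posterior, which is why P(X) can be ignored when only the most probable class is needed.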


The Bayesian classifier works as follows:

1. Let D be a training set of tuples and their class labels.
Each tuple is represented by an n-dimensional attribute
vector, $X = (x_1, x_2, \ldots, x_n)$, depicting n measurements
made on the tuple from n attributes, respectively,
$A_1, A_2, \ldots, A_n$.

2. Suppose there are m classes, $C_1, C_2, \ldots, C_m$. Given a
tuple X, the classifier will predict that X belongs to the
class having the highest posterior probability
conditioned on X. That is, X belongs to the class $C_i$ if and only if
$P(C_i/X) > P(C_j/X)$ for $1 \le j \le m$, $j \ne i$. Thus we
maximize $P(C_i/X)$. The class $C_i$ for which $P(C_i/X)$ is
maximized is called the maximum posterior hypothesis.

3. By Bayes' theorem (1), $P(C_i/X) = P(X/C_i)P(C_i)/P(X)$.
As P(X) is constant for all classes, only $P(X/C_i)P(C_i)$
needs to be maximized. If the class prior probabilities are
not known, then it is commonly assumed that the classes
are equally likely, that is, $P(C_1) = P(C_2) = \cdots = P(C_m)$,
and $P(X/C_i)$ alone is maximized.
Otherwise, $P(X/C_i)P(C_i)$ is maximized.

4. Given data sets with many attributes, it would be
extremely computationally expensive to compute
$P(X/C_i)$. In order to reduce computation in evaluating $P(X/C_i)$,
the naive assumption of class conditional independence
is made. This presumes that the values of the attributes
are conditionally independent of one another, given the
class label of the tuple. Thus,

$$P(X/C_i) = \prod_{k=1}^{n} P(x_k/C_i) \qquad (2)$$

5. In order to predict the class label of X, $P(X/C_i)P(C_i)$ is
evaluated for each class $C_i$. The classifier predicts that
the class label of tuple X is the class $C_i$ if and only if
$P(X/C_i)P(C_i) > P(X/C_j)P(C_j)$ for $1 \le j \le m$, $j \ne i$. In
other words, the predicted class label is the class $C_i$ for
which $P(X/C_i)P(C_i)$ is the maximum [21].
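The five steps above can be sketched as a minimal categorical naive Bayes classifier in Python. This is an illustrative sketch rather than the authors' implementation: the function names `train` and `predict` and the six-tuple toy data set are assumptions, and probabilities are estimated as plain relative frequencies with no smoothing.

```python
from collections import Counter, defaultdict


def train(tuples, labels):
    """Steps 1-2: count class frequencies and per-class attribute-value frequencies."""
    class_counts = Counter(labels)
    # value_counts[(class, attribute index)][value] = number of matching tuples
    value_counts = defaultdict(Counter)
    for x, c in zip(tuples, labels):
        for k, value in enumerate(x):
            value_counts[(c, k)][value] += 1
    return class_counts, value_counts


def predict(x, class_counts, value_counts):
    """Steps 3-5: return the class maximizing P(X/C_i) P(C_i), plus all scores."""
    total = sum(class_counts.values())
    scores = {}
    for c, n_c in class_counts.items():
        score = n_c / total  # prior P(C_i)
        for k, value in enumerate(x):
            # P(x_k / C_i), multiplied together as in equation (2)
            score *= value_counts[(c, k)][value] / n_c
        scores[c] = score
    return max(scores, key=scores.get), scores


# Hypothetical toy data: each tuple is (ASS, ATT); labels are ESM classes.
toy_tuples = [("Yes", "A"), ("Yes", "B"), ("No", "B"),
              ("No", "C"), ("No", "C"), ("Yes", "C")]
toy_labels = ["Avg", "Avg", "Avg", "Fail", "Fail", "Fail"]

model = train(toy_tuples, toy_labels)
label, scores = predict(("No", "C"), *model)
print(label, scores)
```

In the toy data no "Avg" tuple has ATT = C, so that class receives a score of zero. A single unseen attribute value therefore rules a class out completely when relative frequencies are used without smoothing.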

3.4 Experimental Results 

In this paper, the naive Bayes classification algorithm is
applied to predict the class label of "End Semester Marks
(ESM)" with the help of the training data given in Table 2. There
are 14 tuples belonging to the class "Average" and 14
tuples belonging to the class "Fail". The data tuples are described
by the attributes Internal Marks (IM), Previous Semester
Marks (PSM), Basic knowledge of the subject (Basics), Ability
to concentrate in the class (ACIC), Assignment (ASS), Content
Perception (CP), Attendance (ATT), Course Outcome
Awareness (ACO) and Tutorial (TUT). The class label
attribute "End Semester Marks (ESM)" has two distinct
values, namely {Average, Fail}. The new student whose class
is to be predicted is shown in Table 3.


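We need to maximize $P(X/C_i)P(C_i)$ for $i = 1, 2$. $P(C_i)$, the prior
probability of each class, can be computed based on the
training data set.

$$
\begin{aligned}
P(\text{ESM} = \text{Avg}) &= \tfrac{14}{28} = 0.5 \\
P(\text{ESM} = \text{Fail}) &= \tfrac{14}{28} = 0.5
\end{aligned}
$$

To compute $P(X/C_i)$ for $i = 1, 2$, we compute the
following probabilities:

$$
\begin{aligned}
P(\text{ACO} = \text{No}/\text{ESM} = \text{Avg}) &= \tfrac{14}{14} = 1 \\
P(\text{ACO} = \text{No}/\text{ESM} = \text{Fail}) &= \tfrac{10}{14} = 0.714286 \\
P(\text{Basics} = \text{Avg}/\text{ESM} = \text{Avg}) &= \tfrac{12}{14} = 0.857143 \\
P(\text{Basics} = \text{Avg}/\text{ESM} = \text{Fail}) &= \tfrac{10}{14} = 0.714286 \\
P(\text{ACIC} = \text{Strong}/\text{ESM} = \text{Avg}) &= \tfrac{2}{14} = 0.142857 \\
P(\text{ACIC} = \text{Strong}/\text{ESM} = \text{Fail}) &= \tfrac{2}{14} = 0.142857 \\
P(\text{CP} = \text{Avg}/\text{ESM} = \text{Avg}) &= \tfrac{10}{14} = 0.714286 \\
P(\text{CP} = \text{Avg}/\text{ESM} = \text{Fail}) &= \tfrac{2}{14} = 0.142857 \\
P(\text{IM} = \text{B}/\text{ESM} = \text{Avg}) &= \tfrac{8}{14} = 0.571429 \\
P(\text{IM} = \text{B}/\text{ESM} = \text{Fail}) &= \tfrac{1}{14} = 0.071429 \\
P(\text{PSM} = \text{Fail}/\text{ESM} = \text{Avg}) &= \tfrac{6}{14} = 0.428571 \\
P(\text{PSM} = \text{Fail}/\text{ESM} = \text{Fail}) &= \tfrac{14}{14} = 1 \\
P(\text{ASS} = \text{No}/\text{ESM} = \text{Avg}) &= \tfrac{6}{14} = 0.428571 \\
P(\text{ASS} = \text{No}/\text{ESM} = \text{Fail}) &= \tfrac{14}{14} = 1 \\
P(\text{TUT} = \text{No}/\text{ESM} = \text{Avg}) &= \tfrac{6}{14} = 0.428571 \\
P(\text{TUT} = \text{No}/\text{ESM} = \text{Fail}) &= \tfrac{14}{14} = 1 \\
P(\text{ATT} = \text{C}/\text{ESM} = \text{Avg}) &= \tfrac{2}{14} = 0.142857 \\
P(\text{ATT} = \text{C}/\text{ESM} = \text{Fail}) &= \tfrac{10}{14} = 0.714286
\end{aligned}
$$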
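The priors and conditional probabilities listed here are relative frequencies counted from the training data in Table 2. The Python sketch below shows one way to reproduce that counting. It is an illustration only: the helper names `prior` and `conditional` are hypothetical, and the table is encoded compactly by using the fact that rows 15 to 28 of Table 2 repeat rows 1 to 14, except that row 16 has IM = B.

```python
from collections import Counter
from fractions import Fraction

COLUMNS = ["ACO", "Basics", "ACIC", "CP", "IM", "PSM", "ASS", "TUT", "ATT", "ESM"]

# Rows 1-14 of Table 2, one record per line, in COLUMNS order.
ROWS_1_TO_14 = """\
No Avg Avg Avg A Avg No Yes A Avg
Yes Avg Avg Strong C Fail No No C Fail
No Avg Avg Weak C Fail No No B Fail
No Avg Strong Strong B Fail No No B Avg
No Avg Avg Avg B Fail Yes No C Avg
No Avg Weak Avg C Fail No No B Fail
No Avg Avg Avg A Avg Yes No A Avg
No Avg Avg Strong A Avg Yes Yes A Avg
No Avg Strong Strong C Fail No No C Fail
No Weak Weak Weak C Fail No No C Fail
No Weak Weak Avg B Fail No Yes B Avg
No Weak Weak Weak D Fail No No C Fail
No Avg Avg Avg B Avg Yes Yes B Avg
Yes Avg Avg Strong C Fail No No C Fail
"""

first_half = [dict(zip(COLUMNS, line.split())) for line in ROWS_1_TO_14.splitlines()]
# Rows 15-28 repeat rows 1-14, except that row 16 has IM = B instead of C.
second_half = [dict(row) for row in first_half]
second_half[1]["IM"] = "B"
records = first_half + second_half

class_counts = Counter(r["ESM"] for r in records)


def prior(label):
    """P(ESM = label), estimated as a relative frequency."""
    return Fraction(class_counts[label], len(records))


def conditional(attribute, value, label):
    """P(attribute = value / ESM = label), estimated by counting matching records."""
    matches = sum(1 for r in records if r["ESM"] == label and r[attribute] == value)
    return Fraction(matches, class_counts[label])


print(prior("Avg"), conditional("IM", "B", "Fail"), conditional("ATT", "C", "Fail"))
```

Using `Fraction` keeps the counts exact, so each value can be compared directly with the fractions over 14 given in the text.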


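Using the above probabilities, we obtain

$$
\begin{aligned}
P(\text{New student}/\text{ESM} = \text{Avg}) ={}& P(\text{ACO} = \text{No}/\text{ESM} = \text{Avg}) \times P(\text{Basics} = \text{Avg}/\text{ESM} = \text{Avg}) \\
& \times P(\text{ACIC} = \text{Strong}/\text{ESM} = \text{Avg}) \times P(\text{CP} = \text{Avg}/\text{ESM} = \text{Avg}) \\
& \times P(\text{IM} = \text{B}/\text{ESM} = \text{Avg}) \times P(\text{PSM} = \text{Fail}/\text{ESM} = \text{Avg}) \\
& \times P(\text{ASS} = \text{No}/\text{ESM} = \text{Avg}) \times P(\text{TUT} = \text{No}/\text{ESM} = \text{Avg}) \\
& \times P(\text{ATT} = \text{C}/\text{ESM} = \text{Avg}) \\
={}& 1 \times 0.857143 \times 0.142857 \times 0.714286 \times 0.571429 \\
& \times 0.428571 \times 0.428571 \times 0.428571 \times 0.142857 \\
={}& 0.000562029
\end{aligned}
$$

Similarly, we can find

$$
\begin{aligned}
P(\text{New student}/\text{ESM} = \text{Fail}) ={}& P(\text{ACO} = \text{No}/\text{ESM} = \text{Fail}) \times P(\text{Basics} = \text{Avg}/\text{ESM} = \text{Fail}) \\
& \times P(\text{ACIC} = \text{Strong}/\text{ESM} = \text{Fail}) \times P(\text{CP} = \text{Avg}/\text{ESM} = \text{Fail}) \\
& \times P(\text{IM} = \text{B}/\text{ESM} = \text{Fail}) \times P(\text{PSM} = \text{Fail}/\text{ESM} = \text{Fail}) \\
& \times P(\text{ASS} = \text{No}/\text{ESM} = \text{Fail}) \times P(\text{TUT} = \text{No}/\text{ESM} = \text{Fail}) \\
& \times P(\text{ATT} = \text{C}/\text{ESM} = \text{Fail}) \\
={}& 0.714286 \times 0.714286 \times 0.142857 \times 0.142857 \\
& \times 0.071429 \times 1 \times 1 \times 1 \times 0.714286 \\
={}& 0.000531244
\end{aligned}
$$

To find the End Semester Marks class $C_i$ that maximizes $P(X/C_i)P(C_i)$,
we compute

$$
\begin{aligned}
P(\text{New student}/\text{ESM} = \text{Avg}) \times P(\text{ESM} = \text{Avg}) &= 0.000562029 \times 0.5 = 0.000281015 \\
P(\text{New student}/\text{ESM} = \text{Fail}) \times P(\text{ESM} = \text{Fail}) &= 0.000531244 \times 0.5 = 0.000265622
\end{aligned}
$$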
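The arithmetic of this worked example can be checked with a short Python sketch that multiplies the stated conditional probabilities as exact fractions. The variable names are hypothetical. The final normalization step is not part of the paper: it is added here only to show how the two scores compare once they are rescaled to sum to one.

```python
from fractions import Fraction as F
from math import prod

# Conditional probabilities as stated in the text, in the order
# ACO, Basics, ACIC, CP, IM, PSM, ASS, TUT, ATT.
given_avg = [F(14, 14), F(12, 14), F(2, 14), F(10, 14), F(8, 14),
             F(6, 14), F(6, 14), F(6, 14), F(2, 14)]
given_fail = [F(10, 14), F(10, 14), F(2, 14), F(2, 14), F(1, 14),
              F(14, 14), F(14, 14), F(14, 14), F(10, 14)]
prior_avg = prior_fail = F(14, 28)

# P(X/C_i) P(C_i) for each class, following equation (2) and step 5.
score_avg = prod(given_avg) * prior_avg
score_fail = prod(given_fail) * prior_fail
predicted = "Avg" if score_avg > score_fail else "Fail"
print(predicted, float(score_avg), float(score_fail))

# Extra step (an assumption, not in the paper): dividing by the sum of the
# two scores turns them into posterior probabilities.
posterior_avg = score_avg / (score_avg + score_fail)
print(float(posterior_avg))
```

The exact products agree with the rounded values in the text to within rounding error. After normalization the class "Average" has a posterior probability of roughly 0.51, so the two classes are close for this particular student.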


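Table 2. Data set for engineering students

| S. no. | ACO | Basics | ACIC | CP | IM | PSM | ASS | TUT | ATT | ESM |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | No | Avg | Avg | Avg | A | Avg | No | Yes | A | Avg |
| 2 | Yes | Avg | Avg | Strong | C | Fail | No | No | C | Fail |
| 3 | No | Avg | Avg | Weak | C | Fail | No | No | B | Fail |
| 4 | No | Avg | Strong | Strong | B | Fail | No | No | B | Avg |
| 5 | No | Avg | Avg | Avg | B | Fail | Yes | No | C | Avg |
| 6 | No | Avg | Weak | Avg | C | Fail | No | No | B | Fail |
| 7 | No | Avg | Avg | Avg | A | Avg | Yes | No | A | Avg |
| 8 | No | Avg | Avg | Strong | A | Avg | Yes | Yes | A | Avg |
| 9 | No | Avg | Strong | Strong | C | Fail | No | No | C | Fail |
| 10 | No | Weak | Weak | Weak | C | Fail | No | No | C | Fail |
| 11 | No | Weak | Weak | Avg | B | Fail | No | Yes | B | Avg |
| 12 | No | Weak | Weak | Weak | D | Fail | No | No | C | Fail |
| 13 | No | Avg | Avg | Avg | B | Avg | Yes | Yes | B | Avg |
| 14 | Yes | Avg | Avg | Strong | C | Fail | No | No | C | Fail |
| 15 | No | Avg | Avg | Avg | A | Avg | No | Yes | A | Avg |
| 16 | Yes | Avg | Avg | Strong | B | Fail | No | No | C | Fail |
| 17 | No | Avg | Avg | Weak | C | Fail | No | No | B | Fail |
| 18 | No | Avg | Strong | Strong | B | Fail | No | No | B | Avg |
| 19 | No | Avg | Avg | Avg | B | Fail | Yes | No | C | Avg |
| 20 | No | Avg | Weak | Avg | C | Fail | No | No | B | Fail |
| 21 | No | Avg | Avg | Avg | A | Avg | Yes | No | A | Avg |
| 22 | No | Avg | Avg | Strong | A | Avg | Yes | Yes | A | Avg |
| 23 | No | Avg | Strong | Strong | C | Fail | No | No | C | Fail |
| 24 | No | Weak | Weak | Weak | C | Fail | No | No | C | Fail |
| 25 | No | Weak | Weak | Avg | B | Fail | No | Yes | B | Avg |
| 26 | No | Weak | Weak | Weak | D | Fail | No | No | C | Fail |
| 27 | No | Avg | Avg | Avg | B | Avg | Yes | Yes | B | Avg |
| 28 | Yes | Avg | Avg | Strong | C | Fail | No | No | C | Fail |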


Table 3. Data set for a new student

| ACO | Basics | ACIC | CP | IM | PSM | ASS | TUT | ATT | ESM |
|---|---|---|---|---|---|---|---|---|---|
| No | Avg | Strong | Avg | B | Fail | No | No | C | ? |


Since 0.000281015 is greater than 0.000265622, the Bayesian
classifier predicts that the new student will achieve the class
"Average" in the End Semester examination. In the same manner,
the performance of other new students can be predicted
based on their previous performance.

4. Conclusion 

Data mining is a collection of algorithms employed by
offices, governments, universities and corporations to predict
and establish trends with specific purposes in mind. In this
paper, the naive Bayes algorithm is applied to explore the
possibility of predicting students' final performance based on
information such as Internal Marks (IM), Previous Semester
Marks (PSM), Basic knowledge of the subject (Basics), Ability
to concentrate in the class (ACIC), Assignment (ASS), Content
Perception (CP), Attendance (ATT), Course Outcome
Awareness (ACO) and Tutorial (TUT). The proposed system
will help students and teachers to enhance
students' final performance in future assessments. This
study will also help to identify those students who need
to make their best effort to improve their performance in
order to pass the examination and secure a good career.